- dis86 - Interactive 8086 Disassembler
- James R. Van Zandt
-
- SYNOPSIS
-
- Dis86 is a full-screen, interactive disassembler of object code for the
- 8086, 8087, 8088, 80186, 80286, and 80386 (products of Intel), and the
- V20 and V30 (products of NEC). The 80386 disassemblies include 32 bit
- operands and addresses. Dis86 implements the concept of a "current
- location" and allows use of the cursor keys to change it. Code can
- come from a .EXE file (in which case the header is properly
- interpreted), any other file (assumed to have no header), or anywhere
- in main memory (0000:0000 - F000:FFFF). Dis86 can install changes,
- even in an .EXE file, making it a convenient way to install patches.
- Versions are available for the IBM PC (and clones) and Z-100.
-
-
- STARTING THE DISASSEMBLER
-
- To disassemble a file, give the file name (optionally preceded by a path
- name) on the command line:
-
- A>dis86 foo.exe
-
- To disassemble from RAM, use an empty command line:
-
- A>dis86
-
- There are no command line switches.
-
-
- FILE HEADER INFORMATION
-
- For a .EXE file, the information in the file header will be displayed when
- the program is first run and in response to the H command (see below).
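-
- As a rough illustration of what "interpreting the header" involves, the
- sketch below reads the fixed part of a standard MS-DOS "MZ" header and
- reports the entry point and initial stack. It is not taken from dis86; the
- names parse_mz_header and describe are hypothetical, and the exact set of
- fields that dis86 displays is not specified here.
-
```python
import struct

# Standard MS-DOS "MZ" header: 2-byte signature, then 13 little-endian words.
MZ_FORMAT = "<2s13H"
MZ_FIELDS = ("signature", "bytes_in_last_page", "pages", "relocations",
             "header_paragraphs", "min_alloc", "max_alloc", "ss", "sp",
             "checksum", "ip", "cs", "reloc_table_offset", "overlay")


def parse_mz_header(data):
    """Return the fixed header fields of a .EXE image as a dict (hypothetical helper)."""
    size = struct.calcsize(MZ_FORMAT)
    if len(data) < size:
        raise ValueError("too short for an MZ header")
    header = dict(zip(MZ_FIELDS, struct.unpack(MZ_FORMAT, data[:size])))
    if header["signature"] not in (b"MZ", b"ZM"):
        raise ValueError("not an MZ executable")
    return header


def describe(header):
    """Format the entry point (CS:IP) and initial stack (SS:SP) as segment:offset."""
    entry = "%04x:%04x" % (header["cs"], header["ip"])
    stack = "%04x:%04x" % (header["ss"], header["sp"])
    return entry, stack
```
-
- Both values are relative to the load address, which is why a load address
- of 0000:0000 lets them be shown without adjustment.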
-
-
- DISPLAY SCREEN
-
- During disassembly, the screen will resemble the following:
-
- 0000:0100 e9 01 90      jmp     9104
- 0000:0103 55            push    bp
- 0000:0104 8b ec         mov     bp,sp
- 0000:0106 83 ec 0e      sub     sp,0e
-
-     ...
-
- 0000:012C 50            push    ax
- 0000:012D b8 69 00      mov     ax,0069
- 0000:0130 50            push    ax
- 0000:0131 e8 e9 5c      call    5e1d
- dis86 1.00 - A SHAREWARE software product (c) 1986, James R. Van Zandt
- >
- ... 0000:0100 0000:0100 0000:0100
-
- Lines 1 through 21 are the disassembled code. Each line starts with
- the current address, followed by the actual bytes being disassembled.
- The rest of the line is the assembly language equivalent, if any, of
- the code. The display for A (ASCII), B (byte), and D (data) formats is
- similar. All numbers are shown in hexadecimal.
-
- Line 22 is a message and prompt line showing, for example, the
- arguments needed for some commands. Line 23 has the prompt. Typed
- characters are echoed on the rest of this line. Line 24 has three
- addresses, which are the first three entries in the stack (see the
- 'cursor right' and 'cursor left' commands below).
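-
- The jump and call targets in the sample screen can be checked by hand: a
- near jmp (e9) or call (e8) holds a 16 bit little-endian displacement that
- is added to the address of the following instruction. The sketch below
- shows that arithmetic; near_target is a hypothetical helper, not part of
- dis86.
-
```python
def near_target(address, instruction):
    """Target offset of an 8086 near jmp (e9) or call (e8) with a 16-bit displacement."""
    if len(instruction) != 3 or instruction[0] not in (0xE8, 0xE9):
        raise ValueError("expected a 3-byte e8/e9 instruction")
    # The displacement is signed and stored low byte first.
    displacement = int.from_bytes(instruction[1:3], "little", signed=True)
    next_ip = address + len(instruction)
    # Offsets wrap around within the 64K segment.
    return (next_ip + displacement) & 0xFFFF
```
-
- For the first sample line, near_target(0x0100, bytes.fromhex("e90190"))
- gives 9104 (hex), and the call at 0131 gives 5e1d, matching the screen.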
-
-
- CURSOR KEYS
-
- The "current location" is the address displayed on the first line
- of disassembly. The cursor keys are used to adjust the current
- location.
-
- The up and down cursor keys (8 and 2 on the numeric pad) are used to
- move the current location a small amount (note that they are not
- inverses):
-
- <up> moves up by one byte (lower address)
- <down> moves down by one line (higher address)
-
-
- The <pg up> and <pg dn> keys (9 and 3 on the numeric pad) move the
- current location by larger amounts. (These will not move the cursor
- out of the disassembly buffer. Otherwise, they are inverses.):
-
- <pg up> moves up by 32 bytes (lower address)
- <pg dn> moves down by 32 bytes (higher address)
-
-
- The above keys change only the current location. Other commands change
- the current location by potentially large amounts, but first save it in
- a stack. The first three addresses in the stack are shown on the
- command line at the bottom of the screen.
-
- If the instruction at the current location is a jump, call, or a
- reference to a data location, the cursor right key (6 on the numeric pad)
- will push the current location on the stack and go to the referenced
- location. For a data reference, the disassembly format is changed to D
- (hex and ASCII).
-
- <right> follows a jump, call, or data reference
-
-
- The cursor left or left arrow key (4 on the numeric pad) will pop the
- last address off the stack. Note that right arrow followed by left
- arrow will return you to the same address, whereas left arrow
- (returning, let us say, to address X) followed by right arrow will only
- return you to the same address if there is an appropriate jump or call
- at X.
-
- <left> pops address stack
-
- After using the right arrow or one of the commands A, B, C, D, or G (in
- next section) to go to a new address, and using the left arrow key to
- pop the stack, you will sometimes want to return to the previous
- address. The stack no longer holds the address. However, the left
- arrow key saves the current location in a special "previous state"
- before popping the stack.
-
- To return to the address stored in the "previous state", type shift
- right arrow on a Z-100, or control right arrow on an IBM PC.
-
- <shift><right> returns to "previous state" (Z-100)
- <cntrl><right> returns to "previous state" (IBM)
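-
- The interplay of the address stack and the "previous state" can be modeled
- in a few lines. This is only a sketch of the behavior described above, not
- the dis86 implementation. The class and method names are hypothetical, and
- two details are assumptions: popping an empty stack does nothing, and
- returning to the previous state leaves the stack untouched.
-
```python
class Navigator:
    """Toy model of the current location, address stack, and previous state."""

    def __init__(self, start):
        self.current = start
        self.stack = []        # saved addresses, most recent last
        self.previous = None   # the "previous state" saved by <left>

    def follow(self, target):
        """<right> (or a command given an address): save location, then move."""
        self.stack.append(self.current)
        self.current = target

    def pop(self):
        """<left>: remember where we were, then return to the last saved address."""
        if not self.stack:     # assumption: nothing happens on an empty stack
            return
        self.previous = self.current
        self.current = self.stack.pop()

    def restore_previous(self):
        """<cntrl><right> or <shift><right>: go back to the previous state."""
        if self.previous is not None:
            self.current = self.previous   # assumption: stack is not changed
```
-
- With nav = Navigator(0x0100), calling nav.follow(0x9104), nav.pop(), and
- nav.restore_previous() in turn visits 9104, 0100, and 9104 again.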
-
-
- In summary, the unshifted keys on the numeric pad are:
-
- <home> top of file        ^  up 1 byte         <pg up> up 32 bytes
-                           |
- <-- pop addr stack                 --> follow jump/call
-                           |
- <end>  end of file        v  down 1 line       <pg dn> down 32 bytes
-
- <ins>  setup options
-
- On the Z-100, the four keys with arrows on them may be used in addition
- to the 2, 4, 6, and 8 on the numeric pad.
-
-
- LETTER COMMANDS FOR MOVING THE CURSOR
-
- There are five letter commands to change the display format and/or
- disassembly address:
-
- A ASCII data
- B byte data (hex)
- D data (both hex bytes and ASCII)
- C code
- G goto
-
- These commands may be in upper or lower case. Each may be followed by:
-
- <ret> Only the display format changes.
-
- A <expression> <ret>
- The current location changes to the specified address.
-
- S <expression> <expression> <expression> <ret>
- The disassembler searches from the current
- address to the end of the buffer for the
- specified sequence of hex bytes. If an
- expression has a segment specified using the
- ':' operator (below), the segment is ignored.
-
- S T [string] <ret>
- The disassembler searches from the current
- address to the end of the buffer for the
- specified ASCII string. Cases are not
- distinct, and the high order bit is ignored.
- The string can also be introduced by a double
- quote.
-
- S R <expression> <ret>
- The disassembler searches from the current
- address to the end of the buffer for a
- reference (jump or call) to the specified
- address.
-
- An <expression> can involve any of these items:
-
- hex numbers (either upper or lower case letters)
- cs, ds, es, ss, fs, gs
- currently assumed segment register values
- $ current location
- @ offset of top address on the stack
-
- ...and any of these operators:
-
- + - * / add, subtract, multiply, divide
- : separate segment and offset
-
- Note that G with no address is a no-op.
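-
- The matching rule for the S T string search (cases not distinct, high
- order bit ignored) can be sketched as follows. The function names are
- hypothetical, and the sketch assumes the search includes the byte at the
- current location; dis86 may start one byte later.
-
```python
def fold(byte):
    """Clear the high-order bit, then fold ASCII upper case to lower case."""
    byte &= 0x7F
    if 0x41 <= byte <= 0x5A:   # 'A'..'Z'
        byte += 0x20
    return byte


def find_string(buffer, start, text):
    """Offset of the first match at or after start, or -1 if there is none."""
    pattern = [fold(b) for b in text.encode("ascii")]
    if not pattern:
        return -1
    for i in range(start, len(buffer) - len(pattern) + 1):
        if all(fold(buffer[i + j]) == p for j, p in enumerate(pattern)):
            return i
    return -1
```
-
- Folding both the buffer and the pattern lets a search for "home" match
- "HOME", "Home", or text whose last character has the high bit set.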
-
-
- OPTIONS
-
- The 'O' command or <ins> (0 on the numeric pad) brings up menus for
- changing setup options and allows the user to reset the disassembly
- window. Use <space> or <esc> to move to the next screen.
-
- The first menu allows the user to select the processor which is
- supposed to execute the code. There is some conflict in op codes
- between the V20 and V30 on one hand and the 80286 and 80386 on the
- other. That is, the two families use the same op codes for different
- instructions. Dis86 selects the instruction appropriate for the chip
- shown in this menu. In addition, instructions not implemented by the
- indicated chip will be flagged. The other item on the first menu lets
- the user specify 16 or 32 bit mode for the 80386. In the 16 bit mode
- the 80386 is similar to the 8086. In the 32 bit mode arithmetic is
- performed in 32 bit registers and all address offsets are 32 bits.
- (The 80386 itself selects the mode based on a bit in the segment table
- entry for the code segment.)
-
- The second menu allows the user to indicate the byte value which matches
- any byte in a byte or character search (the "wild card" byte) and select
- the number of bytes displayed on each line for the A, B, or D formats.
- The latter value can also be set using the W command.
-
- The last options display is a small map of the code being disassembled
- which will resemble the following:
-
- ds= -10
- cs=0000
- | ss=0960
- es= -10 |
- | cursor=0000:0453 |
- CCCCCCCCCCCCCCcccccccccccccc
- ^0000:0000
- ^0000:6144
-
- The Cs represent the code being disassembled. The capital Cs are the
- portion of code in the disassembly window (see discussion below). The
- assumed values for the segment registers, the current location (labeled
- "cursor"), and the beginning and end addresses of the disassembly
- window are also shown. The window can be adjusted using the right and
- left cursor keys.
-
- By entering the options menu with the <ins> key and stepping from one
- menu to the next with <ret>, you can leave your right hand on the
- numeric pad.
-
-
- MISCELLANEOUS COMMANDS
-
- The 'P' command is used to print a disassembly listing to a file. The
- first time this command is used, it prompts for a file name. The
- default file name is "printout". To actually send the listing to a
- printer, specify the filename "prn". If the file already exists the
- new information will be appended. The file is automatically closed
- before the disassembler exits. The command also prompts for the
- beginning and end addresses of the code to be printed. The default
- addresses print the current screen. When the printing is finished, the
- current address is advanced to the first byte not printed. Thus, you
- can repeat the sequence
-
- P <ret> <ret>
-
- to print a large section.
-
- Enter 'R' to display and/or change the assumed segment register values.
- Entries may be full expressions. For example, to copy the value from SS
- into DS, use the cursor keys to select the DS register and type "ss".
-
- The 'S' command selects a new segment register value for displaying
- addresses. The new register is shown on the message line. The actual
- address being disassembled is not changed (see "segmentation" below).
-
- The 'W' command is used to set the number of bytes displayed on each
- line for the A, B, and D formats. This is useful for displaying
- tables. For example, when dis86 is executed without a file, it
- displays bytes starting at address 0000:0000 and the width is set to
- four so each interrupt vector is shown on a separate line.
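-
- Each interrupt vector is four bytes: a 16 bit offset followed by a 16 bit
- segment, both stored low byte first. A minimal sketch of decoding such
- rows is shown here; decode_vector and vector_rows are hypothetical names,
- and the sample bytes in the note that follows are made up for illustration.
-
```python
def decode_vector(entry):
    """Split a 4-byte interrupt vector into (segment, offset)."""
    if len(entry) != 4:
        raise ValueError("an interrupt vector is exactly 4 bytes")
    offset = int.from_bytes(entry[0:2], "little")
    segment = int.from_bytes(entry[2:4], "little")
    return segment, offset


def vector_rows(table):
    """One 'nn  ssss:oooo' line per complete 4-byte entry in the table."""
    rows = []
    for i in range(0, len(table) - 3, 4):
        segment, offset = decode_vector(table[i:i + 4])
        rows.append("%02x  %04x:%04x" % (i // 4, segment, offset))
    return rows
```
-
- For the bytes 5b e0 00 f0, decode_vector returns segment f000 and offset
- e05b.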
-
- Type '?' to get a series of help screens. Type <esc> to return to the
- disassembly, or any other key to advance to the next screen.
-
- The 'E' command allows the user to modify the program being
- disassembled. Changes are initially made only in the disassembly
- buffer. Before the buffer is overwritten or the disassembler
- terminates, the user is asked whether the changes are to be written to
- the file or RAM area being disassembled.
-
- Enter 'Q' to stop the disassembler.
-
-
- TYPING REQUESTED DATA
-
- Many commands supply default entries for requested data. If you decide
- to accept the default, just enter <ret>. For editing entries,
- you can position the cursor using the left and right cursor keys to
- move by one character, <home> (7 on the numeric pad) to move to the
- left end of the string, or <end> (1 on the numeric pad) to move to the
- right end. Use the <del> or <backspace> keys to delete incorrect
- characters, or just type characters to be inserted. (There is no
- "replace" typing mode.) In every case but one, you can also edit the
- default entry by making <right>, <end>, or <del> your first keystroke.
- The exception is the default for the byte search function.
-
-
- DISASSEMBLY WINDOW
-
- The disassembler uses a buffer to hold the code being disassembled.
- For most purposes, this disassembly window is transparent to the user.
- If the user requests an address within the file but outside the
- disassembly window, the appropriate code is automatically read in. The
- existence of the window is apparent in only three cases:
-
- 1. If the disassembler is started near the end of the window and
- reaches the end before it fills the screen, the rest of the
- screen will be left blank.
-
- 2. The searches are done only from the current location to the end
- of the buffer.
-
- 3. If the contents of the buffer have been changed (see 'E'
- command), they are optionally written out before being
- overwritten.
-
-
- LOAD ADDRESS
-
- Code from a .COM file is displayed as though its Program Segment Prefix
- were at 0000:0000 and its load address were 0000:0100.
-
- Code from a .EXE file is displayed as though its load address were
- 0000:0000. This puts its Program Segment Prefix 10 (hex) paragraphs, or
- 100 (hex) bytes, lower. This is somewhat awkward, because the DS and ES
- registers are initialized to point to the PSP. The disassembler
- displays this segment value as -10. The advantage of a load address of
- 0000:0000 is that no relocation is necessary. The bytes displayed are
- exactly the same as those in the file. This also means that the code
- can be modified (see the 'E' command above) and written back to the
- file without being "unrelocated".
-
-
- SEGMENTATION
-
- Addresses are displayed in segment:offset form, using the current
- assumed value of the current segment register. The current segment
- register can be selected using the 'S' command to step among the
- available registers (CS, SS, DS, ES, FS, and GS - the last two only
- with 80386 code). Changing segment registers or their values does not
- move the disassembler cursor. Only the displayed segment and offset
- values will change to reflect the new assumptions. Legal offsets will
- be displayed as a four digit hex number (0000 to FFFF). Other offsets
- (negative or greater than 64K) will also be calculated and displayed
- correctly, although they are illegal on the 8086. Illegal offsets will
- have more than four digits.
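-
- The effect of selecting a different segment register can be worked out
- with real-mode address arithmetic: a segment value counts 16 byte
- paragraphs, so the same byte gets a different offset under each assumed
- segment. The sketch below uses hypothetical helper names and does not
- reproduce how dis86 formats illegal offsets.
-
```python
def linear(segment, offset):
    """Linear address of segment:offset in real mode (16 bytes per paragraph)."""
    return segment * 16 + offset


def offset_in(segment, address):
    """Offset of a linear address relative to an assumed segment value."""
    return address - segment * 16


def is_legal(offset):
    """True if the offset fits the 8086 range 0000 to FFFF."""
    return 0 <= offset <= 0xFFFF
```
-
- Taking the cursor at 0000:0453 from the sample map in the OPTIONS section:
- relative to ds = -10 the offset is 0553, which is legal, while relative
- to ss = 0960 it is negative (-91ad) and therefore illegal.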
-
- The segment register values are initialized as indicated in the file
- header (for .EXE files) or to zero (for other files or RAM). The
- disassembler has no way of determining the values which may be set
- during execution. For example, the initialization code for DeSmet C
- programs resets DS to the same value as the initial SS before executing
- main().
-
- The assumed segment register values can be altered in two ways. Any
- segment register can be changed using the register menu reached by the
- 'R' command. In addition, when the right arrow key is used to follow a
- far call or jump, the new code segment value is loaded into the CS
- register. When the user specifies a new segment value on an A, B, C, D,
- or G command, that value is used for subsequent displays but none of the
- assumed segment register values is changed.
-
- The segmentation models of the protected modes of the 80286 and 80386
- are not supported.
-
-
- ALIGNMENT
-
- Dis86 will correctly disassemble code if started on the first byte of
- an instruction. If started in the middle of an instruction, it will
- disassemble that instruction and perhaps several more incorrectly. In
- this case the disassembler is said to be out of alignment with the
- object code. The disassembler will tend to correct its alignment if it
- continues long enough. 8086 instructions tend to be longer than, for
- example, those for the 8080, so the disassembler will tend to stay out
- of alignment for more instructions. Generally speaking, the alignment
- will be correct after the first half dozen lines.
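-
- The resynchronization can be seen on the first nine bytes of the sample
- screen (e9 01 90 55 8b ec 83 ec 0e). The sketch below knows the lengths of
- only the few opcodes that occur when decoding those bytes, so it is an
- illustration under that assumption rather than a real instruction decoder;
- all names are hypothetical.
-
```python
SAMPLE = bytes.fromhex("e9 01 90 55 8b ec 83 ec 0e")

ONE_BYTE = {0x50, 0x55, 0xEC}        # push ax, push bp, in al,dx
IMM16 = {0xB8, 0xE8, 0xE9}           # mov ax,imm16; call rel16; jmp rel16
MODRM = {0x01: 0, 0x8B: 0, 0x83: 1}  # opcode -> immediate bytes after ModRM


def displacement_size(modrm):
    """Displacement bytes implied by a 16-bit ModRM byte."""
    mod, rm = modrm >> 6, modrm & 7
    if mod == 0:
        return 2 if rm == 6 else 0   # mod 00, rm 110 is a direct address
    if mod == 1:
        return 1
    if mod == 2:
        return 2
    return 0                         # mod 11: register operand


def length(code, i):
    """Length of the instruction starting at index i (tiny opcode subset)."""
    op = code[i]
    if op in ONE_BYTE:
        return 1
    if op in IMM16:
        return 3
    if op in MODRM:
        return 2 + displacement_size(code[i + 1]) + MODRM[op]
    raise ValueError("opcode %02x not in this sketch" % op)


def boundaries(code, start):
    """Offsets at which a linear sweep from start begins each instruction."""
    found, i = [], start
    while i < len(code):
        found.append(i)
        i += length(code, i)
    return found
```
-
- Starting at offset 0 the sweep visits offsets 0, 3, 4, and 6. Starting one
- byte late it visits 1, 5, and 6: two instructions are decoded incorrectly,
- after which the sweep is back in alignment at the sub instruction.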
-
-
- SUMMARY
-
- Here are all the letter commands:
-
- A nnnn ASCII data
- B nnnn byte data (hex)
- C nnnn code (disassembly)
- D nnnn data (hex and ASCII)
- E enter new data (follow with a hex expression for each new byte)
-
- G nnnn goto address nnnn
- H display file header information (.EXE files only)
- O change setup options
- P print disassembly listing to file
- Q quit to DOS
-
- R change segment register values
- S select new segment register
- W set bytes of data per line for A, B, and D formats
- X exchange current address (at top of screen) with top of stack
- ? display help screens
-
-
-
- EXAMPLE 1
-
- In the examples, <left>, <right>, <up>, and <down> refer to the four
- cursor keys (4, 6, 8, and 2 on the numeric pad, plus the four arrow keys
- on the Z-100 keyboard). <pg up> and <pg dn> refer to 9 and 3 on the
- numeric pad.
-
- To investigate the bootstrap code, type
-
- A>dis86 <ret>
-
- and press
-
- <space>
-
- to advance to the disassembly display (which will be the interrupt
- vectors). Next type
-
- c a ffff:0000 <ret>
-
- (for Code format at the Address ffff:0000). On an IBM, the ROM release
- date and machine ID appear in the last 16 bytes of the ROM. To see them,
- type
-
- D <ret>
-
- The release date is at addresses ffff:0005 - ffff:000c in ASCII. The
- machine ID is at ffff:000e. Some of the possible values are:
-
- ff IBM PC
- fe IBM XT and Portable IBM PC
- fd IBM PCjr
- fc IBM AT
- 2d Compaq
- 9a Compaq-Plus
-
- Return to code format by typing
-
- C <ret>
-
- One of the instructions displayed will almost certainly be a jump. If
- so, press
-
- <down>
-
- enough times to bring the jump to the top line, then
-
- <right>
-
- to follow the jump. Note that the previous addresses were pushed onto
- the stack, as shown on the bottom line. To return to the most recent
- address, press
-
- <left>
-
- To leave the disassembler, press
-
- Q
-
-
- EXAMPLE 2
-
- For a second example, let us disassemble the disassembler itself.
- Begin by typing
-
- A>dis86 dis86.exe <ret>
-
- Note the header information, including the entry point of 0000:0000 and
- the initial stack location of approximately 09e0:9eb8. Proceed to the
- disassembly screen by typing
-
- <space>
-
- The disassembler starts in C (code) format at the entry point, which is
- a jump to the initialization code. To follow the jump, type
-
- <right>
-
- One of the early instructions in the initialization code refers to the
- first location in the stack segment. Bring this location to the top of
- the screen by typing
-
- <pg dn> <down> <down>
-
- and follow the reference by typing
-
- <right>
-
- Since it was a data reference, the disassembler automatically switched
- to D (data) format. Note that the two previous addresses have been
- pushed onto the stack, as shown at the bottom of the screen. Return to
- the most recent one by typing
-
- <left>
-
- The initialization code gets rather involved, but one of its functions
- is to initialize DS to the same value as SS. To reflect this, use the
- R command:
-
- R
-
- DS is the first register in the list, so you need only enter the
- appropriate value:
-
- ss <ret> <space>
-
- The code for the main program immediately follows the jump at
- 0000:0000. To return there, type
-
- <left>
-
- Send a copy of this screen to the file "printout" by typing
-
- P <ret> <ret> <ret>
-
- To inspect the data segment, type
-
- A ds:0 <ret>
-
- To display more characters on each line, use the W command:
-
- W 60 <ret>
-
- Use the search command to find one of the messages:
-
- G S T hime <ret>
-
- This string won't be found. To correct the spelling to "home" and try
- again, type
-
- G S T <right> o <del> <ret>
-
- Once again, leave the disassembler by pressing
-
- Q
-
-
-
-
-